Molecular characterization of a Novel NAD+-dependent farnesol dehydrogenase SoFLDH gene involved in sesquiterpenoid synthases from Salvia officinalis

Salvia officinalis is one of the most important medicinal and aromatic plants in terms of nutritional and medicinal value because it contains a variety of vital active ingredients. Terpenoid compounds, particularly monoterpenes (C10) and sesquiterpenes (C15), are the most important and abundant of these active substances. Terpenes play a variety of roles and have beneficial biological properties in plants. With these considerations in mind, the current study sought to clone the NAD+-dependent farnesol dehydrogenase (SoFLDH, EC: 1.1.1.354) gene from S. officinalis. Sequence analysis revealed that SoFLDH has an open reading frame of 2,580 base pairs that encodes 860 amino acids. SoFLDH has two conserved domains and four types of highly conserved motifs: YxxxK, RXR, RR (X8) W, and TGxxGhaG. SoFLDH was cloned from S. officinalis leaves and functionally overexpressed in Arabidopsis thaliana to investigate its role in sesquiterpenoid synthesis. Compared with the transgenic plants, the wild-type plants showed a slight delay in growth and flower formation. Gas chromatography-mass spectrometry analysis revealed that SoFLDH transgenic plants produced numerous terpenes, particularly sesquiterpenes. These results provide a basis for further investigation of the role of the SoFLDH gene and for elucidating the regulatory mechanisms of sesquiterpene synthesis in S. officinalis. Our study also paves the way for future metabolic engineering of the biosynthesis of useful terpene compounds in S. officinalis.

Terpenoids are well known as the largest group of natural compounds and secondary metabolites identified from the plant kingdom and other organisms, with more than 40,000 different structures [9]. Terpenoids derive their structure from isopentenyl diphosphate (IPP), which contains five carbon atoms (C5) [10,11]. The name terpene originates from the terebinth tree (Pistacia terebinthus). The structure of the terpene unit was described by Wallach and later refined by Ruzicka [12-15]. Some terpenes are involved in plant primary metabolism, such as the carotenoid pigments, the phytol side chain of chlorophyll, the gibberellin plant hormones, and the phytosterols of cellular membranes [16,17], which are important for plant development and flowering. However, a wide range of terpenes are secondary metabolites with essential roles in the adaptation of plants to stress. Nonvolatile and volatile terpenes play important roles in attracting pollinators and predators of herbivores, in defense against photo-oxidative stress, and in direct defense against insects and microbes [18]. At present, several studies are focused on understanding the mechanisms of terpene biosynthesis and function in depth [3,4].
Salvia species are known to contain a high proportion of essential oil, and this fragrant oil consists mainly of monoterpenes and sesquiterpenes. The composition of the monoterpenes and sesquiterpenes differs depending on the plant species and cultivar, as well as on the tissue type [3,4,19-24]. The biosynthesis genes for the sesquiterpenes remained elusive until recently.
The main sesquiterpenes in the Egyptian cultivar of S. officinalis are isocaryophyllene, α-caryophyllene, caryophyllene oxide and (−)-germacrene D, whose synthases are encoded by three unigene families [3]. So far, their biological and physiological functions have remained largely unclear. This makes the enzymes that catalyze their formation both interesting and difficult to distinguish functionally.
In this study, we cloned the S. officinalis NAD+-dependent farnesol dehydrogenase (SoFLDH, EC: 1.1.1.354) gene and functionally expressed it in Arabidopsis thaliana. The recombinant SoFLDH catalyzed the conversion of farnesyl pyrophosphate (FPP) into various types of terpenes, especially sesquiterpenes. The SoFLDH protein displayed a distinctive amino acid sequence with highly conserved motifs, including the YxxxK, RXR, RR (X8) W and TGxxGhaG motifs. Finally, this study used a protein modeling database to investigate the 3D structure of the protein and predict its function.

Plant materials and tissue collection
Salvia officinalis seeds were kindly provided by staff of the Egyptian Desert Gene Bank (EDGB) of the Desert Research Center (DRC), Egypt. The seeds were grown in our growth chamber at the National Research Centre (Cairo, Egypt) at 22°C day/20°C night, 50-70% humidity, and a 16 h day/8 h night photoperiod, with a light intensity of 100-150 μmol m−2 s−1 from fluorescent bulbs. For gene amplification, young leaves were collected, immediately frozen in liquid nitrogen, and stored at −80°C until RNA extraction.

RNA extraction and cDNA library preparation
Total RNA was extracted from young leaves of S. officinalis using TransZol Reagent (Focus Bioscience, Australia) according to the manufacturer's instructions and treated with DNase I (Takara). RNA integrity was checked on a 1.4% agarose gel, and purity was assessed using a NanoDrop ND1000 (NanoDrop Technologies, Wilmington, DE, USA). RNA pools for the cDNA libraries were prepared by mixing equal volumes of the three RNA replicates in one tube. Two micrograms of total RNA (800 ng approximately) per sample was used for cDNA synthesis with TransScript First-Strand cDNA Synthesis SuperMix (TransGen Biotech, Beijing, China) according to the manufacturer's instructions. The cDNA synthesis reaction was run at 42°C for 15 min followed by 85°C for 5 min [3,4].

qRT-PCR, semiquantitative RT-PCR analysis and Western blot (WB)
Quantitative RT-PCR was performed on an iQ5 Multicolor Real-Time PCR Detection System (Bio-Rad, USA) as described previously by Ali et al. and Hussain et al. [3,4,26,27], with SYBR Green Master (ROX) (Newbio Industry, China), following the manufacturer's instructions, in a total reaction volume of 20 μl. Gene-specific primers for SoACTIN (forward 5′-GGCAGTTCTCTCCCTCTAT-3′ and reverse 5′-GAGGTGGTCGGTGAGAT-3′; 157 bp amplicon) were used for the reference gene, and gene-specific primers for SoFLDH (forward 5′-TTCCTGATCCCTCCAGATT-3′ and reverse 5′-CAATGTAGCCATCCGTTGA-3′) gave a 153 bp amplicon. Semiquantitative RT-PCR was performed on a Biometra PCR system (Biometra T Gradient Thermoblock PCR Thermocycler, American Laboratory Trading, San Diego, CA) in a total reaction volume of 25 μl. Gene-specific primers for At-B-actin (forward 5′-GGCTGAGGCTGATGATATTC-3′ and reverse 5′-CCTTCTGGTTCATCCCAAC-3′; 155 bp amplicon) were used for the reference gene, together with the same forward and reverse primers for SoFLDH. All primers were designed using the IDT primer design tools (http://www.idtdna.com/scitools/Applications/RealTimePCR/). The semiquantitative RT-PCR conditions were as follows: a predenaturation step at 95°C for 4 min; 35 cycles of amplification (95°C for 30 s, 58°C for 30 s and 72°C for 1 min); and a final extension step at 72°C for 10 min. The PCR products were resolved on a 1.3% agarose gel, and the expression levels of the At-B-actin and SoFLDH genes were detected. To confirm stable transformation, SoFLDH protein was isolated from various transgenic lines and from wild-type A. thaliana and detected by Western blotting (WB). Fresh leaves were homogenized with pestles in prechilled mortars on ice in 1 ml of cold homogenization buffer (100 mM Tris, pH 7.5, 10% sucrose, 5 mM sodium EDTA, and 5 mM sodium EGTA) per gram of tissue, as described by Ma et al. [28].
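
As a quick sanity check on the qRT-PCR primers listed above, their length, GC content and an approximate melting temperature can be computed directly from the sequences. The following is a minimal Python sketch and not part of the original workflow: the function names are hypothetical, and the Wallace rule used here (2°C per A/T, 4°C per G/C) is only a rough estimate for short oligonucleotides that ignores salt and primer concentration.

```python
# Illustrative helper (assumption: not the authors' method). It reports basic
# properties of the qRT-PCR primers quoted in the text.

PRIMERS = {
    "SoACTIN_F": "GGCAGTTCTCTCCCTCTAT",
    "SoACTIN_R": "GAGGTGGTCGGTGAGAT",
    "SoFLDH_F": "TTCCTGATCCCTCCAGATT",
    "SoFLDH_R": "CAATGTAGCCATCCGTTGA",
}


def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in the primer."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq: str) -> int:
    """Rough melting temperature (degrees C) by the Wallace rule."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(f"{name}: {len(seq)} nt, GC {gc_percent(seq):.1f}%, "
              f"Tm ~{wallace_tm(seq)} C")
```

Estimates of this kind are only useful for spotting large mismatches between a primer pair; dedicated primer design tools apply more detailed thermodynamic models.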

Full-length terpene synthase cDNA clone and vector
The full-length cDNA sequence of SoFLDH was obtained from RNA-Seq data generated by our transcriptome sequencing of S. officinalis leaves [3]. SoFLDH was amplified from young-leaf cDNA using short and long gene-specific primers designed according to the Gateway pDONR221 vector manual. The initial PCR amplification was performed with the short primers, SoFLDH forward 5′-ATGTGGGGATTAGGTGGGAGT-3′ and reverse 5′-TCAATCATGTCACTCACTCACTCAA-3′, using KOD-Plus-Neo DNA polymerase (Novagen) under the following conditions: 3 min at 96°C; 33 cycles of 10 s at 98°C, 30 s at 60°C (annealing) and 1.5 min at 68°C; and a final 10 min at 68°C. The first PCR product was used as the template for a second PCR with the long primers, SoFLDH forward 5′-GGGGACAAGTTTGTACAAAAAAGCAGGCTTCATGTGGGGATTAGGTGGG-3′ and reverse 5′-GGGGACCACTTTGTACAAGAAAGCTGGGTTCAATCATGTCACTCACT-3′. BP Clonase (Invitrogen, USA) was used to insert the PCR products into the Gateway entry vector pDONR221. Positive pDONR221-SoFLDH constructs were sequenced and recombined into the destination vector pB2GW7 using Gateway LR Clonase (Invitrogen, USA); the positive pB2GW7-SoFLDH construct was then used for A. thaliana transformation [3,4].
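
The relationship between the short and long primers can be checked programmatically: each long primer should consist of a Gateway attB adapter followed by the 5′ end of the corresponding short gene-specific primer. The Python sketch below assumes the standard attB1 and attB2 adapter sequences; the function name and the minimum-overlap threshold are hypothetical and serve only as an illustration.

```python
# Assumption: standard Gateway attB1/attB2 adapter sequences.
ATTB1 = "GGGGACAAGTTTGTACAAAAAAGCAGGCT"
ATTB2 = "GGGGACCACTTTGTACAAGAAAGCTGGGT"

# Primer sequences as given in the text (spaces from extraction removed).
SHORT_F = "ATGTGGGGATTAGGTGGGAGT"
SHORT_R = "TCAATCATGTCACTCACTCACTCAA"
LONG_F = "GGGGACAAGTTTGTACAAAAAAGCAGGCTTCATGTGGGGATTAGGTGGG"
LONG_R = "GGGGACCACTTTGTACAAGAAAGCTGGGTTCAATCATGTCACTCACT"


def split_attb_primer(long_primer, adapter, short_primer, min_overlap=15):
    """Split a long primer into (adapter, linker, gene-specific part).

    Raises ValueError if the adapter is missing or if no tail of at least
    `min_overlap` bases matches the start of the short primer.
    """
    if not long_primer.startswith(adapter):
        raise ValueError("long primer does not start with the attB adapter")
    tail = long_primer[len(adapter):]
    # Try the shortest possible linker first, then progressively longer ones.
    for i in range(len(tail)):
        gene_part = tail[i:]
        if len(gene_part) >= min_overlap and short_primer.startswith(gene_part):
            return adapter, tail[:i], gene_part
    raise ValueError("no gene-specific overlap with the short primer found")


if __name__ == "__main__":
    print(split_attb_primer(LONG_F, ATTB1, SHORT_F))
    print(split_attb_primer(LONG_R, ATTB2, SHORT_R))
```

With these sequences, the forward primer carries two extra bases between attB1 and the start codon, which is a common way of keeping the reading frame; whether that was the intent here is an assumption.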

Arabidopsis plant growth conditions and preparation of Agrobacterium cultures for floral-dip transformation
Seeds of A. thaliana ecotype Columbia-0 (Col-0) were prepared for germination by adding 1.2 ml of sterilized water to the seeds in a 2.0 ml Eppendorf tube and keeping them at ~4°C in a refrigerator for three days. The seeds were then grown in a growth chamber at 60-70% humidity and 22°C day/20°C night, with a light intensity of 100-150 μmol m−2 s−1 from fluorescent bulbs and a 16 h day/8 h night photoperiod. For floral-dip transformation, two-month-old plants were used, one week after their primary inflorescences had been clipped. Watering was stopped four days before transformation to improve the transformation efficiency. The pB2GW7-SoFLDH construct was introduced into Agrobacterium tumefaciens strain GV101 by direct electroporation. Recombinant GV101 was grown for 48 h at 28°C on solid LB medium supplemented with 60 μg/ml each of rifampicin and spectinomycin. A single colony was inoculated into 0.8 ml of liquid medium of the same composition and grown overnight at 28°C with shaking at 180 rpm. After 24 h, the 0.8 ml culture was transferred to a 300 ml conical flask containing 60 ml of LB medium with the same supplements and grown overnight at 28°C in a shaker until an optical density (OD600) of 0.75 was reached. Cells were harvested by centrifugation at 4,000 rpm for 12 min at 4°C, and the pellet was resuspended in floral-dip inoculation medium containing 5.2% sucrose and 0.055% Silwet. A. thaliana was transformed by immersing the secondary inflorescences in the inoculation medium with gentle stirring to allow Agrobacterium harbouring the pB2GW7-SoFLDH vector to enter the flower gynoecium. The transformed plants were kept in the dark under a plastic cover overnight to maintain humidity. After 24 h, the plants were returned to their normal growth conditions.
The transformation was repeated after 7-10 days to increase the transformation efficiency. Plants were grown for an additional 30-37 days, until all siliques were brown and dry. The seeds were harvested and stored at ~4°C under desiccation [3,4,29,30]. BASTA was used to select transformant seedlings, which were also confirmed as positive transgenic lines by PCR. More than 12 positive plant lines carrying the selectable gene were analysed for terpenoid profile and target gene expression.
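
The inoculation medium is specified only by its percentages (5.2% sucrose and 0.055% Silwet), so the amounts to weigh or pipette depend on the batch volume. The short Python sketch below scales the recipe; it assumes that the sucrose percentage is weight per volume and the Silwet percentage is volume per volume, which the text does not state, and the function name and the 200 ml batch size are hypothetical.

```python
def floral_dip_recipe(volume_ml: float,
                      sucrose_pct_wv: float = 5.2,
                      silwet_pct_vv: float = 0.055) -> dict:
    """Amounts needed for `volume_ml` of floral-dip inoculation medium.

    Assumes sucrose is given as % (w/v) and Silwet as % (v/v).
    """
    if volume_ml <= 0:
        raise ValueError("volume must be positive")
    sucrose_g = sucrose_pct_wv / 100.0 * volume_ml            # g per batch
    silwet_ul = silwet_pct_vv / 100.0 * volume_ml * 1000.0    # microlitres
    return {"sucrose_g": sucrose_g, "silwet_ul": silwet_ul}


if __name__ == "__main__":
    recipe = floral_dip_recipe(200.0)  # hypothetical 200 ml batch
    print(f"sucrose: {recipe['sucrose_g']:.2f} g, "
          f"Silwet: {recipe['silwet_ul']:.0f} ul")
```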

Phenotypic evaluation
Transgenic plants were watered and fertilized regularly with Miracle Gro fertilizer (Scott's Company, USA), prepared according to the manufacturer's instructions, for phenotypic comparison between transgenic A. thaliana lines and their wild-type counterparts. Plants were grown in the growth chamber under the conditions described above for vegetative growth and flowering. Plants were assessed for leaf morphology, flowering time, and terpene metabolism [4].

Metabolite extraction from transgenic A. thaliana leaves
Terpenoid compounds were extracted from A. thaliana overexpressing SoFLDH and from the wild type. Thirty-six leaves from the transgenic A. thaliana line (three leaves from each plant) were ground to a fine powder in liquid nitrogen with a mortar and pestle and immediately soaked in n-hexane in amber storage bottles; 30 ml screw-top vials with silicone/PTFE septum lids (http://www.sigmaaldrich.com) were used to reduce the loss of volatiles to the headspace. The samples were then incubated with shaking at 37°C and 200 rpm for 72 h. Afterward, the solvent was transferred with a glass pipette to a 10 ml glass centrifuge tube with a screw top and silicone/PTFE septum lid and centrifuged at 5,000 rpm for 10 min at 4°C to remove plant debris. The supernatant was pipetted into screw-cap glass vials and concentrated to 1.5 ml under a stream of nitrogen gas using a nitrogen evaporator (Organomation) with a water bath at room temperature (Toption-China-WD-12). The concentrated oil was transferred to a fresh 1.5 ml amber glass crimp vial; screw-top vials with silicone/PTFE septum lids were used to reduce the loss of volatiles to the headspace. For complete oil recovery, the film of crude oil remaining on the inner surface of the concentration vials was dissolved in a minimum volume of n-hexane, mixed thoroughly, and transferred to the same 1.5 ml amber crimp vial. The crimp vial was either placed on the autosampler of the gas chromatography-mass spectrometry (GC-MS) system for analysis, or closed with its silicone/PTFE septum lid, sealed with parafilm, and stored at −20°C until GC-MS analysis [3,4,31].

GC-MS analysis of essential oil components
GC-MS analysis was performed on a Shimadzu GCMS-QP2010 Ultra system (Tokyo, Japan). A 1 μl aliquot of each sample was injected (split ratio 15:1) into the GC-MS, which was equipped with an HP-5 fused silica capillary column (30 m x 0.25 mm ID, 0.25 μm film thickness). Helium was used as the carrier gas at a constant flow of 1.0 ml min−1. Mass spectra were recorded between 50 and 450 m/z. The oven temperature was held isothermally at 60°C for 10 min, increased at 4°C min−1 to 220°C, held at 220°C for 10 min, increased at 1°C min−1 to 240°C, held at 240°C for 2 min, and finally held for 10 min at 350°C. Volatile constituents were identified by comparing their recorded mass spectra with those in the Wiley GC/MS Library (10th Edition) (Wiley, New York, NY, USA), the Volatile Organic Compounds (VOC) Analysis S/W software, and the NIST Library (2014 edition). The relative percentage of each component was calculated by comparing its average peak area with the total peak area. All experiments were performed in triplicate under the same conditions for each isolation technique, with a total GC run time of 80 min [3,4,31].
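
The relative percentage calculation described above (average peak area of a component divided by the total peak area) can be written out as a short Python sketch. The compound names and peak areas below are made-up placeholders, the function name is hypothetical, and the sketch assumes that the total is taken over all integrated peaks supplied to it; how unidentified peaks were handled is not stated in the text.

```python
from statistics import mean
from typing import Dict, List


def relative_peak_areas(replicates: Dict[str, List[float]]) -> Dict[str, float]:
    """Relative % of each component from replicate peak areas.

    Each component's peak area is averaged over its replicates, then divided
    by the sum of all averaged areas and expressed as a percentage.
    """
    averaged = {name: mean(areas) for name, areas in replicates.items()}
    total = sum(averaged.values())
    if total <= 0:
        raise ValueError("total peak area must be positive")
    return {name: 100.0 * area / total for name, area in averaged.items()}


if __name__ == "__main__":
    # Placeholder data: three replicate injections per component.
    example = {
        "compound_A": [30.0, 32.0, 28.0],
        "compound_B": [60.0, 58.0, 62.0],
        "compound_C": [10.0, 10.0, 10.0],
    }
    for name, pct in relative_peak_areas(example).items():
        print(f"{name}: {pct:.2f}%")
```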

In silico analysis of SoFLDH
The SoFLDH gene has an open reading frame of 2,580 bp, which encodes an 860 amino acid protein with a molecular mass of 95.114 kDa and a theoretical isoelectric point (pI) of 8.59. The deduced amino acid sequence of SoFLDH, including its signal peptide, is longer than those of monoterpene synthases (600-650 aa) and of other sesquiterpene synthases (550-580 aa). Furthermore, the presence of a 30 amino acid targeting sequence, predicted with the 'iPSORT' program, suggested that the SoFLDH protein is localized to the mitochondria and chloroplasts, where sesquiterpene and triterpene biosynthesis takes place. BLASTX analysis revealed that Salvia splendens farnesol dehydrogenase is the closest homologue of SoFLDH, with 91.74% identity, 91% query cover and an E-value of 0.0 (https://blast.ncbi.nlm.nih.gov/Blast.cgi). Phylogenetic analysis likewise showed that the deduced amino acid sequence of SoFLDH is similar to farnesol dehydrogenase (TEY48599.1) from Salvia splendens and to other terpene alcohol dehydrogenases and benzyl alcohol dehydrogenases from different plant species (Fig 1). The function of the SoFLDH gene was initially predicted from sequence alignment with well-characterized terpene alcohol dehydrogenase and benzyl alcohol dehydrogenase sequences, and from conserved motifs shared with genes from the Lamiaceae family and other plants (Fig 2). For instance, the SoFLDH protein contains two copies of the highly conserved YxxxK motif (residues 9-13 and residues 585-589). It also contains the commonly conserved RXR motif (residues 285-287), which is important for product cyclization in Class III TPS proteins [32-34] (Fig 2). In addition, the conserved RR (X8) W region (residues 405-415), which is found in most sesquiterpene synthases, is marked in Fig 2. SoFLDH also contains the conserved motif TGxxGhaG (residues 440-447). All of these motifs have previously been reported to flank the entrance of the active site. The nucleotide-binding motif TGxxxGhG is highly conserved and functions in coenzyme binding and in stabilizing the central β-sheet. Moreover, the YxxxK motif forms the catalytic center [35] and is responsible for the preference for NADP(H) over NAD(H) [36,37]. Finally, protein sequences that contain one, two or all of these conserved domains belong to the terpene synthase family [3,4,11,38-40].
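
Motif positions of the kind reported in this section can be located in a protein sequence with regular expressions. The Python sketch below is an illustration only: the toy sequence is invented and is not the SoFLDH sequence, the function name is hypothetical, and the residue classes used for 'h' (hydrophobic) and 'a' (aromatic) in TGxxGhaG are assumptions, since the text does not define them.

```python
import re

# Regular expressions for the four motifs named in the text.
# Assumption: 'x'/'X' is any residue, 'h' is hydrophobic, 'a' is aromatic.
MOTIF_PATTERNS = {
    "YxxxK": r"Y.{3}K",
    "RXR": r"R.R",
    "RR(X8)W": r"RR.{8}W",
    "TGxxGhaG": r"TG.{2}G[AVLIMFWC][FWYH]G",
}

# Invented toy sequence for demonstration (not the real SoFLDH protein).
TOY = "MAYSGAKTTRARLLRR" + "A" * 8 + "WGGTGSSGLFGE"


def find_motifs(protein: str) -> dict:
    """Return 1-based (start, end) positions of every motif occurrence.

    A lookahead is used so that overlapping occurrences are also reported.
    """
    protein = protein.upper()
    hits = {}
    for name, pattern in MOTIF_PATTERNS.items():
        regex = re.compile(f"(?=({pattern}))")
        hits[name] = [(m.start() + 1, m.start() + len(m.group(1)))
                      for m in regex.finditer(protein)]
    return hits


if __name__ == "__main__":
    for motif, positions in find_motifs(TOY).items():
        print(motif, positions)
```

Short patterns such as RXR match frequently by chance, so hits from a scan like this would normally be interpreted alongside the alignment with homologous sequences rather than on their own.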

Putative tissue expression pattern and subcellular localizations of SoFLDH gene
To better understand the function of the SoFLDH gene in different tissues, we analyzed putative SoFLDH expression profile maps based on Arabidopsis transcript expression (Fig 3a-3c). The Arabidopsis Electronic Fluorescent Pictograph Expression Profile Browser indicated that the SoFLDH gene is expressed in all Arabidopsis tissues, most highly in roots, followed by petals, sepals and hypocotyl (Fig 3a). Similarly, the GmTPS21, SoHUMS, SoLINS2, SoNEOD, SgTPSV, SgFARD and SgGERIS genes from G. max, S. officinalis and S. guaranitica were reported to have higher expression levels in roots and seeds by Liu et al. and Ali et al. [3,4,41]. The putative tissue-specific expression of SoFLDH in stem epidermis was also analyzed; expression was higher at the top of the stem epidermis than at the bottom (Fig 3b). Furthermore, the putative subcellular localization of SoFLDH was analyzed based on Arabidopsis protein localization to identify the sites of SoFLDH synthesis among different cell compartments (cell plate, cytoskeleton, cytosol, extracellular, Golgi, endoplasmic reticulum, plasma membrane, mitochondrion, nucleus, peroxisome, plastid, unclear, unknown and vacuole) (Fig 3c). The Arabidopsis Cell Electronic Fluorescent Pictograph subcellular localization profiles indicated that SoFLDH is most highly present in the plasma membrane, followed by the endoplasmic reticulum, vacuole, cytosol, mitochondrion, Golgi body and plastid (Fig 3c). This is consistent with previous reports [26,42-44] that most TPS genes are targeted to the plastid or to other cell organelles such as the mitochondrion and nucleus.

Tissue-specific expression of SoFLDH gene by quantitative RT-PCR
To determine the organ-specific expression pattern of SoFLDH, we quantified SoFLDH transcript levels in S. officinalis young leaves, old leaves, stems, flower buds, flowers and roots using qRT-PCR (Fig 4). SoFLDH was expressed in all tissues, with distinct expression patterns. SoFLDH transcripts showed the highest expression in old leaves, followed by stems, flowers, flower buds, young leaves and roots (Fig 4). Similar results were obtained by Ali et al. [4], who reported that the sesquiterpene gene encoding selinene synthase (SgTPS-3) was most highly expressed in old leaves, followed by flower buds.
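
The text does not state how relative expression was calculated from the qRT-PCR data. One widely used option is the 2^−ΔΔCt method, sketched below in Python under that assumption. The Ct values are made-up placeholders, the function name is hypothetical, and the calculation assumes roughly 100% amplification efficiency for both the target gene and the reference gene (SoACTIN in the methods above).

```python
def relative_expression(ct_target_sample: float,
                        ct_ref_sample: float,
                        ct_target_calibrator: float,
                        ct_ref_calibrator: float) -> float:
    """Fold change of a target gene by the 2^-ddCt method.

    dCt = Ct(target) - Ct(reference) within each sample;
    ddCt = dCt(sample) - dCt(calibrator); fold change = 2^(-ddCt).
    """
    d_ct_sample = ct_target_sample - ct_ref_sample
    d_ct_calibrator = ct_target_calibrator - ct_ref_calibrator
    dd_ct = d_ct_sample - d_ct_calibrator
    return 2.0 ** -dd_ct


if __name__ == "__main__":
    # Placeholder Ct values: a tissue of interest versus a calibrator tissue.
    fold = relative_expression(ct_target_sample=22.0, ct_ref_sample=18.0,
                               ct_target_calibrator=25.0, ct_ref_calibrator=18.0)
    print(f"fold change relative to calibrator: {fold:.1f}")
```

Choosing the calibrator (for example, the tissue with the lowest expression) only rescales the values; the ranking of tissues is unchanged.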

The 3D structure of SoFLDH protein
The SoFLDH protein sequence contains a large conserved domain, which was identified using the InterPro protein sequence analysis & classification database (https://www.ebi.ac.uk/interpro/). The 860-aa SoFLDH protein has an NAD(P)-binding domain superfamily domain (IPR036291) spanning residues 433-625, and this domain has overlapping entries with other domain superfamilies, such as Oxidoreductase, N-terminal (IPR000683), Semialdehyde dehydrogenase, NAD-binding (IPR000534), Lactate/malate

Functional characterization of SoFLDH gene in transgenic A. thaliana leaves
The function and specificity of SoFLDH were examined in transgenic A. thaliana Columbia-0 (Col-0) plants. Overexpression of SoFLDH in A. thaliana was achieved using Agrobacterium tumefaciens strain GV101 harboring the transformation vector pB2GW7-SoFLDH. Twelve BASTA-resistant transgenic A. thaliana lines were generated (Fig 6a and 6b). The transgenic A. thaliana plants showed longer flowering stems with more flowers than the wild type (Fig 6a). Expression of the SoFLDH gene in positive transgenic A. thaliana was confirmed using semiquantitative RT-PCR (Fig 6b), and the transcript levels of the transgenic plants were verified using quantitative RT-PCR (Fig 6c). Leaves of twelve 35-day-old transgenic plants and wild-type plants were sampled for RNA isolation and cDNA synthesis. The transgenic plants showed higher expression of the SoFLDH gene than the wild type.

PLOS ONE
Moreover, we used Western blotting (WB) analysis to confirm stable transformation with the SoFLDH gene and successful protein expression (Fig 6d). This finding confirmed that the SoFLDH gene was successfully overexpressed in A. thaliana. Based on the transcript levels and WB results, we selected OE-SoFLDH-2, OE-SoFLDH-5 and OE-SoFLDH-7 for further analysis. Morphological analysis showed that flower formation was delayed in the wild type compared with the transgenic plants (Fig 6a). These results are in line with Ali et al. [3,4], who found that the overexpression of terpenoids and TPS synthesis genes,

Terpene contents in transgenic A. thaliana leaves
The metabolites were analyzed by GC-MS to identify the terpenes formed as a result of transformation with the SoFLDH gene (S1 Fig). Mono-, sesqui- and diterpene peaks were clearly detected. The type and number of metabolites are reported as the percentage of peak area (% peak area) (Table 1). To identify the terpenes in wild-type and transgenic A. thaliana, we used mass spectral libraries, published references and extracts of wild-type Arabidopsis, which produces different amounts and types of terpenoids. As expected, the leaves of transgenic A. thaliana plants emitted higher levels of various terpenoids than the leaves of the wild type. In transgenic A. thaliana leaves, diterpenes were the main group (30.49%), followed by sesquiterpenes (15.46%) and a single monoterpene (0.89%), whereas in wild-type A. thaliana, diterpenes were the only group detected (13.12%) (S1 Fig and Table 1). Similar production of various terpenes by gene overexpression in A. thaliana was obtained previously [4,29]. Of more direct interest, SoFLDH was responsible for the production of various types of terpenes, especially sesquiterpenes, through the common isoprenoid pathway of sesquiterpene biosynthesis. Our results are in line with previous evidence that various terpene synthase genes can synthesize a number of metabolites simultaneously, for example carene synthases, (±)-linalool synthases, cineole synthases, myrcene synthase, β-amyrin synthases and terpinolene synthases [10,11,46-49].

Conclusions
The present study cloned and functionally identified one of the narrowly expressed sesquiterpene synthase genes (SoFLDH), which encodes the NAD+-dependent farnesol dehydrogenase [EC:1.1.1.354] in S. officinalis. Overexpression of SoFLDH in A. thaliana accelerated the growth and flowering of the OE-SoFLDH-2, OE-SoFLDH-5 and OE-SoFLDH-7 transgenic lines. These three lines exhibited high expression of the SoFLDH gene, which regulates the production of various terpenes. The various types of terpenes, especially sesquiterpenes, formed in these transgenic A. thaliana plants reveal the capacity of A. thaliana to synthesize the same products through the common mevalonate pathway of sesquiterpene biosynthesis. The SoFLDH protein exhibits strong sequence similarity to farnesol dehydrogenase (TEY48599.1) from Salvia splendens, as well as to other terpene alcohol dehydrogenases and benzyl alcohol dehydrogenases from different plant species. The SoFLDH protein contains four types of highly conserved motifs (YxxxK, RXR, RR (X8) W and TGxxGhaG) and four conserved domains in the ligand pocket, and these domains have overlapping entries with another important superfamily domain. Overall, these data show that A. thaliana can serve as an effective transformation system for studying the terpene synthase genes of S. officinalis.